Abstract

Every system of any significant size is created by composition from 
smaller sub-systems or components. It is thus fruitful to analyze the 
fault-tolerance of a system as a function of its composition. In this 
paper, two basic types of system composition are described, and an 
algebra to describe fault tolerance of composed systems is derived. The 
set of systems forms monoids under the two composition operators, 
and a semiring when both are concerned. A partial ordering relation 
between systems is used to compare their fault-tolerance behaviors. 

1 Introduction

Fault tolerance is a subject of immense importance in the design and anal- 
ysis of many kinds of systems. It has been studied in distributed comput- 
ing [TU [TOj |6] and elsewhere in computer science [3j H] , in the context of 
safety-critical systems [TTJ [T3], and in many other places. Authors such as 
Perrow [Hj and Neumann [13] note many instances where standard assump- 
tions made in designing fault-tolerant systems seem not to hold, and draw 
conclusions from these deviations. 

In this paper, we take a different, algebraic view of fault tolerance, based 
on the composition of a system. The basic notion is that every system of 
any significant size is created by composition from smaller sub-systems or 
components. The fault-tolerance of the overall system is influenced by that 
of the components that underlie it. Conversely, a system may itself be a 
module or part of a larger system, so that its fault-tolerance affects that of 
the whole of which it is a part. 

It is thus fruitful to analyze the fault tolerance of a system as a
function of its composition. In the preliminary analysis, it is assumed that
component failures occur independently: failure of one component does not
automatically imply the failure of another.

Composition of components to create a larger system is considered in
Section 2 to happen in two ways: direct sum, denoted $+$, and direct
product, denoted $\times$. This is then used to describe an arithmetic on
systems. The fault tolerance of systems is formally described in Section 3
using functions from the set of systems to the set of natural numbers,
whose basic properties are obtained. In Section 4, the arithmetic on
systems is extended to define system monoids by direct sum and direct
product, and later, a system semiring considering both. Using this as a
basis, a partial ordering of systems by fault tolerance is given in
Section 5, and basic properties are derived. In Section 6, it is shown that
monoids can be defined on the set of equivalence classes of systems under
fault tolerance (rather than on the set of systems, as done in Section 4).
Section 7 briefly indicates how this could be used to create a semiring on
the set of equivalence classes of systems under worst-case fault tolerance,
essentially allowing for the development of results analogous to those of
Section 5.

The mathematics used is standard algebra as might be learned in a
first-level graduate or upper-level undergraduate course, and is mostly
self-explanatory. Standard works in the field such as Hungerford [9] or
Lang, and monographs on semiring theory [8], may be useful references.

2 Notation and Preliminaries 

We use lowercase p, q, etc., to denote individual components that are as- 
sumed to be atomic at the level of discussion, i.e., they have no components 
or sub-systems of their own. Systems that are not necessarily atomic are 
denoted by A, B, etc., with or without subscripts. Where particular clarity 
is required, component will be used to refer to an atomic component, while 
subsystem will be used to refer to a component that is not atomic. Sets of 
systems — atomic or not — are denoted by V, Q, etc., and in particular U is 
the universal set of all systems in the domain of discourse. The set of natural 
numbers is denoted by N, and the set of integers by Z. Other mathematical 
terms and notation are defined in the sections where they are first found. 

In the beginning, we assume that our components and subsystems are 
disjoint, i.e., that they do not share any components among themselves, and 
that they fail, if they do, independently of one another. 

A fault is a failure of a subsystem or component, while a failure applies 
to some system as a whole, but the latter term will also be applied in the 
context of events that are faults from a larger system perspective but may 
be considered failures from a component or subsystem perspective. 

Let A and B be two components that can form a system. 

Definition 2.1. The $+$ operator is considered to apply for systems
consisting of two components when the failure of either would cause the
system as a whole to fail. Equivalently, we may say that the direct sum
$A + B$ of components A and B is a system where the failure of either A or
B will cause overall system failure.

One example is a computer program that needs two resources, such as 
two files, in order to perform, with lack of functionality resulting from the 
lack of availability of either. 

A more basic example of such a system is two light-bulbs connected in 
series, with a voltage applied to cause them to glow. If either bulb burns 
out, the connection is broken and system failure occurs. 

In colloquial terms, a direct sum is a situation where the "weakest link" 
component needs to fail, for the system to fail. 

Definition 2.2. The $\times$ operator applies for systems consisting of two
components when the failure of both is necessary to cause the system as a
whole to fail. Equivalently, we may say that the direct product
$A \times B$ of components A and B is a system where the failure of both is
necessary to cause overall system failure.

One example is a computer program that needs a file that is available in 
two replicas; the program would only fail to perform if both replicas were 
to be unavailable. 

A more basic example of such a system is one consisting of two light-bulbs 
connected in parallel across a voltage source, with the system considered 
functional if at least one bulb glows. System failure occurs only when both 
bulbs burn out. 

In colloquial terms, a direct product is a situation where the "strongest 
link" component needs to fail, for the system to fail. 

Note that these composition operators are different from the way com- 
position is described elsewhere in the literature [HE], and are also not sig- 
nificantly related to the theory of "fault tolerance components" proposed by 
Arora and Kulkarni [5].

Given the previous definitions of the direct sum and direct product
operators, we obtain an "arithmetic" on systems as follows [16], where =
stands for isomorphism.

Proposition 2.3. Given disjoint components A, B, and C, we have the 
following. 

(i) $+$ and $\times$ are commutative and associative:
$A \times B = B \times A$; $A + C = C + A$;
$A + (B + C) = (A + B) + C$; and
$A \times (B \times C) = (A \times B) \times C$.

(ii) $\times$ distributes over $+$:
$A \times (B + C) = (A \times B) + (A \times C)$.

Proof. The properties given in (i) are obvious from Definitions 2.1 and 2.2.

For the distributive property of (ii), consider that $A \times (B + C)$ will
fail when both A and $B + C$ fail, that is, when A fails together with at
least one of B or C, and that this happens precisely when
$(A \times B) + (A \times C)$ fails. □
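
The distributive law of Proposition 2.3 (ii) can be checked mechanically on a small case. The following is a minimal sketch, assuming a hypothetical encoding that is not from the paper: a system is a nested tuple, and the helper `fails` decides whether it is down for a given set of failed atomic components. The two sides of the law are compared over every possible set of failures; agreement on all of them is used here as a stand-in for isomorphism.

```python
from itertools import combinations

# Hypothetical encoding (an assumption, not the paper's notation):
# a system is a nested tuple built from atoms, direct sums and direct products.
def atom(name):
    return ("atom", name)

def dsum(a, b):
    return ("sum", a, b)      # fails if either part fails

def dprod(a, b):
    return ("prod", a, b)     # fails only if both parts fail

def fails(system, failed):
    """True if `system` is down when the atoms named in `failed` are down."""
    kind = system[0]
    if kind == "atom":
        return system[1] in failed
    left = fails(system[1], failed)
    right = fails(system[2], failed)
    return (left or right) if kind == "sum" else (left and right)

def all_subsets(names):
    for k in range(len(names) + 1):
        for combo in combinations(names, k):
            yield set(combo)

A, B, C = atom("a"), atom("b"), atom("c")
lhs = dprod(A, dsum(B, C))                # A x (B + C)
rhs = dsum(dprod(A, B), dprod(A, C))      # (A x B) + (A x C)

# Both sides must agree on every one of the 8 possible failure sets.
distributes = all(fails(lhs, s) == fails(rhs, s)
                  for s in all_subsets(["a", "b", "c"]))
```

Note that on the right-hand side the same component A appears in both product terms, so the check compares failure behavior only and says nothing about component counts.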

Therefore, systems are described by polynomial expressions on variables
denoting their components, such as $(P^3 + 3Q) \times 5R^2$.

In addition, by analogy with regular arithmetic, we can say the following. 

Proposition 2.4. Any system described by an expression of direct sum and
product operations on components $A_{ij}$ can be expressed in a canonical
sum of products form such as

\[
\{A_{0,0} \times A_{0,1} \times \cdots \times A_{0,n_0-1}\}
+ \{A_{1,0} \times A_{1,1} \times \cdots \times A_{1,n_1-1}\}
+ \cdots
+ \{A_{m-1,0} \times A_{m-1,1} \times \cdots \times A_{m-1,n_{m-1}-1}\}.
\]

In Proposition 2.4, there are m product terms to be added together, and
product term i consists of the direct product of $n_i$ components.



3 Fault Tolerance of Composed Systems 

About systems that are members of U, we can state the following results
from [16], with $A^n$ describing the direct product of $A \in U$ with itself
n times, and $nA$ describing the direct sum of $A \in U$ with itself n times.

Lemma 3.1. A system $A^n$ can tolerate faults in $n - 1$ of the subsystems,
where $n \in \mathbb{Z}^+$, and a system $nA$ can tolerate zero faults.

Proof. This is immediate, given the definitions of the direct sum and
product. A system composed of the direct sum of A with itself n times will
fail should any one of the components fail, whilst one composed of the
direct product of A with itself n times will only fail if all of them fail
(i.e., it can tolerate faults in $n - 1$ of them). □

As a consequence, we have the following. 

Corollary 3.2. A system $mA^n$ consists of $m \times n$ subsystems, where
$m, n \in \mathbb{Z}^+$, but can only tolerate faults in $n - 1$ of them in
the worst case.

Notice, however, that the following holds. 

Corollary 3.3. In the best case, a system $mA^n$ can tolerate faults in
$m \times (n - 1)$ components.

Proof. In the best case, a system $mA^n = A^n + \cdots + A^n$ would have
$n - 1$ failures in each of the $A^n$ and still be functional, for a
tolerance of $m \times (n - 1)$ faults. □
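
Corollaries 3.2 and 3.3 can be illustrated by brute force for small m and n. The sketch below assumes that A is a single atomic component (an assumption made only to keep the enumeration small) and uses hypothetical helper names. It enumerates every set of failed components of $mA^n$ and records the smallest set that brings the system down and the largest set that does not.

```python
from itertools import combinations

def m_a_n_fails(failed, m, n):
    """mA^n modelled as m groups of n atomic copies of A (assumption).

    Each group is a direct product, so a group fails only when all n copies
    fail; the groups are joined by direct sum, so any failed group brings
    the whole system down.
    """
    return any(all((g, k) in failed for k in range(n)) for g in range(m))

def tolerances(m, n):
    """Return (worst-case, best-case) tolerance of mA^n by enumeration."""
    comps = [(g, k) for g in range(m) for k in range(n)]
    surviving, failing = [], []
    for size in range(len(comps) + 1):
        for combo in combinations(comps, size):
            if m_a_n_fails(set(combo), m, n):
                failing.append(size)
            else:
                surviving.append(size)
    # Worst case: one less than the smallest failing set.
    # Best case: the largest set of failures the system survives.
    return min(failing) - 1, max(surviving)

worst, best = tolerances(3, 2)   # 3A^2: expect n - 1 = 1 and m(n - 1) = 3
```

The enumeration is exponential in $m \times n$, so it is only meant as a sanity check on tiny instances.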

In the previous results, we have dealt with just one type of component.
However, the same idea can be generalized to multiple types of components, 
as follows. It is assumed here that a failure of any component, regardless of 
its type, is of equal value and adds 1 to the number of failures in the overall 
system. 

The idea is to specify the fault-tolerance (in the best case and in the 
worst case) recursively, in terms of sub-systems. 

Definition 3.4. If, as stated previously, U is the set of all systems, then
$\Phi_{best} : U \to \mathbb{N}$ and $\Phi_{worst} : U \to \mathbb{N}$ are
functions giving the best- and worst-case fault tolerances of a system, such
that:

(i) $\Phi_{best}(A) = i$ for $A \in U$, if i is the cardinality of the
largest set of components of A that can fail without causing A to fail.

(ii) $\Phi_{worst}(B) = j$ for $B \in U$, if $j + 1$ is the cardinality of
the smallest set of components in B whose failure causes B to fail.

Quite obviously, we have $\Phi_{best}(A) \geq \Phi_{worst}(A)$,
$\forall A \in U$. The values of these functions may be computed recursively
as indicated in the following theorem.

Theorem 3.5. Given systems $A, B \in U$, if $|A|$ and $|B|$ are the numbers
of components therein, the following hold:

(i) $\Phi_{best}(A \times B) = \max\{\Phi_{best}(A) + |B|, |A| + \Phi_{best}(B)\}$;

(ii) $\Phi_{worst}(A \times B) = \Phi_{worst}(A) + \Phi_{worst}(B) + 1$;

(iii) $\Phi_{best}(A + B) = \Phi_{best}(A) + \Phi_{best}(B)$;

(iv) $\Phi_{worst}(A + B) = \min\{\Phi_{worst}(A), \Phi_{worst}(B)\}$.

Proof. For parts (i) and (ii), we know that $A \times B$ will fail only when
both A and B fail.

Therefore, $\Phi_{best}(A \times B)$ reflects the case where one of A or B
fails fully (every component in it failing), and the other has its best-case
limit of component failures. Therefore, the total number is
$\max\{\Phi_{best}(A) + |B|, |A| + \Phi_{best}(B)\}$ failures. Likewise,
$\Phi_{worst}(A \times B) = \Phi_{worst}(A) + \Phi_{worst}(B) + 1$, because
the smallest set of failures that brings down both A and B has
$(\Phi_{worst}(A) + 1) + (\Phi_{worst}(B) + 1)$ components, and the
worst-case tolerance is one less than that.

For parts (iii) and (iv), we know that a system $A + B$ will fail should
either A or B fail.

Therefore, $\Phi_{best}(A + B) = \Phi_{best}(A) + \Phi_{best}(B)$, because in
the best case A and B will sustain their limit of failures and yet not fail.

Likewise, $\Phi_{worst}(A + B) = \min\{\Phi_{worst}(A), \Phi_{worst}(B)\}$, as
in the worst case, exceeding the smaller of $\Phi_{worst}(A)$ and
$\Phi_{worst}(B)$ by one failure is all that is necessary to cause one of
the two, A or B, to fail. □
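
The recurrences of Theorem 3.5 translate directly into a recursive computation. The following is a minimal sketch under stated assumptions: systems are encoded as nested tuples (a hypothetical encoding, not from the paper), all atomic components have distinct names (the disjointness assumption of Section 2), and an atom has best- and worst-case tolerance 0. The function `phi` applies the four recurrences, and `brute_force` recomputes the same values from Definition 3.4 by enumeration, as a cross-check.

```python
from itertools import combinations

# Hypothetical encoding (assumption): systems as nested tuples.
def atom(name):
    return ("atom", name)

def dsum(a, b):
    return ("sum", a, b)

def dprod(a, b):
    return ("prod", a, b)

def phi(system):
    """Return (size, best, worst) using the recurrences of Theorem 3.5."""
    if system[0] == "atom":
        return 1, 0, 0
    size_a, best_a, worst_a = phi(system[1])
    size_b, best_b, worst_b = phi(system[2])
    size = size_a + size_b
    if system[0] == "sum":
        return size, best_a + best_b, min(worst_a, worst_b)      # (iii), (iv)
    best = max(best_a + size_b, size_a + best_b)                 # (i)
    return size, best, worst_a + worst_b + 1                     # (ii)

def names(system):
    if system[0] == "atom":
        return [system[1]]
    return names(system[1]) + names(system[2])

def fails(system, failed):
    if system[0] == "atom":
        return system[1] in failed
    left = fails(system[1], failed)
    right = fails(system[2], failed)
    return (left or right) if system[0] == "sum" else (left and right)

def brute_force(system):
    """Same triple, computed from Definition 3.4 by full enumeration."""
    comps = names(system)
    surviving, failing = [], []
    for k in range(len(comps) + 1):
        for combo in combinations(comps, k):
            (failing if fails(system, set(combo)) else surviving).append(k)
    return len(comps), max(surviving), min(failing) - 1

# (p x q) + (r x (s + t)), with five distinct atomic components.
example = dsum(dprod(atom("p"), atom("q")),
               dprod(atom("r"), dsum(atom("s"), atom("t"))))
```

For this example both `phi(example)` and `brute_force(example)` give 5 components, a best-case tolerance of 3, and a worst-case tolerance of 1.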

The theorem just stated can also be extended in the obvious way to
systems of $n > 2$ components.

Theorem 3.6. Given subsystems $A_i \in U$, $0 \leq i \leq n - 1$, we have the
following, with $\prod$ and $\sum$ denoting, respectively, the direct product
and direct sum of multiple subsystems, and $|A_i|$ denoting the number of
components in subsystem $A_i$.

(i) For the best-case tolerance of a product of n subsystems:

\[
\Phi_{best}\Bigl(\prod_{i=0}^{n-1} A_i\Bigr)
  = \sum_{i=0}^{n-1} |A_i| - |A_k| + \Phi_{best}(A_k),
\]

where $A_k$ is a subsystem with the lowest difference between its own
number of components and its own best-case fault tolerance.

(ii) For the worst-case tolerance of a product of n subsystems:

\[
\Phi_{worst}\Bigl(\prod_{i=0}^{n-1} A_i\Bigr)
  = \sum_{i=0}^{n-1} \Phi_{worst}(A_i) + n - 1.
\]

(iii) For the best-case tolerance of a sum of n subsystems:

\[
\Phi_{best}\Bigl(\sum_{i=0}^{n-1} A_i\Bigr)
  = \sum_{i=0}^{n-1} \Phi_{best}(A_i).
\]

(iv) For the worst-case tolerance of a sum of n subsystems:

\[
\Phi_{worst}\Bigl(\sum_{i=0}^{n-1} A_i\Bigr)
  = \min_{0 \leq i \leq n-1} \Phi_{worst}(A_i).
\]

Proof. The proof is exactly similar to that of Theorem 3.5, though a little
more involved.

For parts (i) and (ii), we know that the system will fail when all of the
$A_i$ subsystems fail.

Therefore, the best-case fault tolerance of the product of the $A_i$ is when
all the subsystems save one fail, and collectively sustain the maximum
possible number of component failures. If the one that does not fail also
sustains component failures, then the best-case fault tolerance is obtained
when the non-failing subsystem has the lowest difference, among all
subsystems $A_i$, between its own number of components and its best-case
fault tolerance.

Likewise, the worst-case fault tolerance of the product of the $A_i$ is when
all subsystems but one fail, but sustain the least possible number of
component failures in doing so. This happens when each $A_i$ sustains its
worst-case number of failures, and then further when $n - 1$ of them sustain
one more failure each, meaning that all but one subsystem have failed.

For parts (iii) and (iv), we know that the system will fail when any one
subsystem $A_i$ fails.

Therefore, the best-case fault tolerance of the sum of the $A_i$ is when
each $A_i$ sustains its best-case limit of failures. Likewise, the worst-case
fault tolerance of the sum of the $A_i$ is when just one $A_i$, the one with
the smallest worst-case tolerance, sustains its worst-case limit of
failures. □
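
The closed forms of Theorem 3.6 should agree with repeated application of the two-system recurrences of Theorem 3.5. The sketch below checks this on one illustrative input. It assumes each subsystem is summarized by a triple (number of components, best-case tolerance, worst-case tolerance); the function names and the sample triples are hypothetical and chosen only for illustration.

```python
from functools import reduce

# Two-system recurrences of Theorem 3.5 on (size, best, worst) triples.
def product_2(a, b):
    return (a[0] + b[0], max(a[1] + b[0], a[0] + b[1]), a[2] + b[2] + 1)

def sum_2(a, b):
    return (a[0] + b[0], a[1] + b[1], min(a[2], b[2]))

# Closed forms of Theorem 3.6 for n subsystems.
def product_n(subs):
    total = sum(size for size, _, _ in subs)
    # A_k is a subsystem with the smallest gap |A_k| - best(A_k).
    k_gap = min(size - best for size, best, _ in subs)
    worst = sum(w for _, _, w in subs) + len(subs) - 1
    return total, total - k_gap, worst

def sum_n(subs):
    return (sum(size for size, _, _ in subs),
            sum(best for _, best, _ in subs),
            min(worst for _, _, worst in subs))

# Illustrative triples (assumption): an atom, p + q, p x q, and r x (s + t).
subs = [(1, 0, 0), (2, 0, 0), (2, 1, 1), (3, 2, 1)]

product_closed = product_n(subs)
product_folded = reduce(product_2, subs)
sum_closed = sum_n(subs)
sum_folded = reduce(sum_2, subs)
```

Folding works because the gap between size and best-case tolerance of a product is the minimum of the gaps of its factors, which is the role played by $A_k$ in part (i).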



4 System Monoids and Semirings 

Given the system arithmetic previously defined, we can posit the existence
of two identity elements, one each for the direct sum and direct product.
Informally, an identity element is one which leaves the system unchanged
under the relevant operator.

Definition 4.1. The multiplicative and additive identities are defined as
follows.

(i) The additive identity 0 is the system such that for any system A,
$A + 0 = 0 + A = A$.

(ii) The multiplicative identity 1 is the system such that for any system
A, $A \times 1 = 1 \times A = A$.

By the commutativity of the $+$ and $\times$ operators, we observe that the
identity elements are two-sided.

Informally, we may describe these elements as follows: 

(i) The additive identity 0 is a system "that is always up." The direct
sum of such a system and A is obviously A itself. 

(ii) The multiplicative identity 1 is a system "that is always down." The 
direct product of such a system and A is likewise A itself. 

Then U, combined with the $+$ operator, is a monoid (a set with an
associative operator and a two-sided identity element) [9]. Similarly, U is
also a monoid when considering the $\times$ operator. For notational
convenience, we denote these monoids as $(U, +)$ and $(U, \times)$.

It is further clear that the set U is a semiring when taken with the
operations $+$ and $\times$, because the following conditions [7] for being
a semiring are satisfied:

(i) $(U, +)$ is a commutative monoid with identity element 0;

(ii) $(U, \times)$ is a monoid with identity element 1;

(iii) $\times$ distributes over $+$ from either side;

(iv) $0 \times A = 0 = A \times 0$ for all $A \in U$.

This system semiring will be denoted by $(U, +, \times)$, and its properties
are as indicated in the following.

Remark 4.2. The semiring $(U, +, \times)$ is zerosumfree, because
$A + B = 0$ implies, for all $A, B \in U$, that $A = B = 0$.

This condition shows [7] that the monoid (U, +) is completely removed 
from being a group, because no non-trivial element in it has an inverse. 

Remark 4.3. $(U, +, \times)$ is entire, because there are no non-zero
elements $A, B \in U$ such that $A \times B = 0$.

This likewise shows that the monoid $(U, \times)$ is completely removed from
being a group, as there is no non-trivial multiplicative inverse.

Remark 4.4. $(U, +, \times)$ is simple, because 1 is infinite, i.e.,
$A + 1 = 1$, $\forall A \in U$.

Given that the semiring $(U, +, \times)$ is both zerosumfree and entire, we
can call it an information algebra [7].

We may state another important definition [7] about semirings, and observe
a property of $(U, +, \times)$.

Definition 4.5. The center $C(U)$ of U is the set
$\{A \in U \mid A \times B = B \times A, \text{ for all } B \in U\}$.

Remark 4.6. The semiring $(U, +, \times)$ is commutative because
$C(U) = U$.
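
The identity laws and the properties used in this section can be spot-checked in a small behavioral model. The sketch below is an illustration under assumptions, not a proof: systems are nested tuples (a hypothetical encoding), the constants `ZERO` and `ONE` stand for the always-up and always-down systems, and two systems are treated as equal when they fail on exactly the same sets of component failures.

```python
from itertools import combinations

ZERO = ("zero",)   # additive identity: a system that is always up
ONE = ("one",)     # multiplicative identity: a system that is always down

def atom(name):
    return ("atom", name)

def dsum(a, b):
    return ("sum", a, b)

def dprod(a, b):
    return ("prod", a, b)

def fails(system, failed):
    kind = system[0]
    if kind == "zero":
        return False
    if kind == "one":
        return True
    if kind == "atom":
        return system[1] in failed
    left = fails(system[1], failed)
    right = fails(system[2], failed)
    return (left or right) if kind == "sum" else (left and right)

def same_behaviour(x, y, names):
    """True if x and y fail on exactly the same sets of failed components."""
    return all(fails(x, set(c)) == fails(y, set(c))
               for k in range(len(names) + 1)
               for c in combinations(names, k))

names = ["p", "q", "r"]
A = dsum(atom("p"), dprod(atom("q"), atom("r")))   # sample system p + (q x r)

laws = {
    "A + 0 = A": same_behaviour(dsum(A, ZERO), A, names),
    "A x 1 = A": same_behaviour(dprod(A, ONE), A, names),
    "0 x A = 0": same_behaviour(dprod(ZERO, A), ZERO, names),
    "A + 1 = 1": same_behaviour(dsum(A, ONE), ONE, names),
}
```

All four entries of `laws` evaluate to `True` for this sample system; the last one is the "1 is infinite" property of Remark 4.4.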

5 Fault-Tolerance Partial Ordering

Consider a partial ordering relation $\preceq$ on U, the set of all systems,
such that $(U, \preceq)$ is a poset. This is a fault-tolerance partial
ordering where $A \preceq B$ means that A has a lower measure of some fault
metric than B (e.g., A has fewer failures per hour than B, or has a better
fault tolerance than B).

Formally, $\preceq$ is a partial ordering on the semiring $(U, +, \times)$ if
the following conditions are satisfied [8].

Definition 5.1. If $(U, +, \times)$ is a semiring and $(U, \preceq)$ is a
poset, then $(U, +, \times, \preceq)$ is a partially ordered semiring if the
following conditions are satisfied for all A, B, and C in U.

(i) The monotony law of addition:

$A \preceq B \longrightarrow A + C \preceq B + C$.

(ii) The monotony law of multiplication:

$A \preceq B \longrightarrow A \times C \preceq B \times C$.

It is assumed that $0 \preceq A$, $\forall A \in U$, on the reasoning that 0,
being a system that never fails, must have the least possible measure of any
fault metric. Similarly, $A \preceq 1$, because 1, being a system that is
always down, must have the greatest possible measure of any fault metric.

Given Definition 5.1, it is instructive to consider the behavior of the
partial order under composition. We begin with a couple of simple results.

Lemma 5.2. If $\preceq$ is a fault-tolerance partial order as defined, then
$\forall A, B \in U$:

(i) $A \preceq A + B$, and

(ii) $A \times B \preceq A$.

Proof. For (i), consider that $0 \preceq B$. Using the monotony law of
addition, we get $0 + A \preceq B + A$. Considering that 0 is the additive
identity element and that addition is commutative, we get
$A \preceq A + B$.

For (ii), consider that $B \preceq 1$. Using the monotony law of
multiplication, we get $B \times A \preceq 1 \times A$. Considering that 1 is
the multiplicative identity element and that multiplication is commutative,
we get $A \times B \preceq A$. □

A property of U in consideration of the + operator can now be noted. 

Remark 5.3. The positive cone of $(U, +, \preceq)$, which is the set of
elements $A \in U$ for which $A \preceq A + B$, $\forall B \in U$, is the set
U itself. The negative cone is empty.

This is a direct consequence of Lemma 5.2 (i), and it also follows that the
set of elements $\{A \mid A + B \preceq A\} = \emptyset$, showing that the
negative cone is empty.

The analogous property of U in consideration of the $\times$ operator can
also be noted.

Remark 5.4. The negative cone of $(U, \times, \preceq)$, which is the set of
elements $A \in U$ for which $A \times B \preceq A$, $\forall B \in U$, is
the set U itself. The positive cone is empty.

As before, this is a direct consequence of Lemma 5.2 (ii), and it likewise
also follows that the set of elements
$\{A \mid A \preceq A \times B\} = \emptyset$, showing that the positive cone
is empty.

Theorem 5.5. If $A + B \preceq C$, then $A \preceq C$ and $B \preceq C$.

Proof. The proof is by contradiction. Assume the contrary. Then
$A + B \preceq C$, and at least one of $A \preceq C$ or $B \preceq C$ is
false.

Without loss of generality, assume that $C \preceq A$. Using the monotony law
of addition and the commutativity of the $+$ operator,
$B + C \preceq A + B$.

Now, by Lemma 5.2 (i), $C \preceq B + C$. Given the transitivity of
$\preceq$, we get $C \preceq A + B$, which is a contradiction. □

An analogous result can also be stated in terms of the product, as follows. 

Theorem 5.6. If $A \preceq B \times C$, then $A \preceq B$ and
$A \preceq C$.

Proof. The proof is again by contradiction. Assume the contrary. Then
$A \preceq B \times C$ and at least one of $A \preceq B$ and $A \preceq C$ is
false.

Without loss of generality, assume that $B \preceq A$. Using the monotony law
of multiplication and the commutativity of the $\times$ operator, we get
$B \times C \preceq A \times C$.

Now, by Lemma 5.2 (ii), $A \times C \preceq A$. Given the transitivity of
$\preceq$, we get $B \times C \preceq A$, which is a contradiction. □

The following generalization of Lemma 5.2 can be made.

Lemma 5.7. If $\preceq$ is a fault-tolerance partial order, then
$\forall n \in \mathbb{Z}^+$ and $\forall A \in U$,

(i) $A \preceq nA$, and

(ii) $A^n \preceq A$.

Proof. For (i), consider that by Lemma 5.2 (i) we have $A \preceq 2A$ (just
set B to be A in the Lemma). Likewise, $kA \preceq (k + 1)A$ for any
$k \geq 2$. By the transitivity of $\preceq$, we therefore have the result.

For (ii), the reasoning is very similar, using Lemma 5.2 (ii) and the
transitivity of $\preceq$. □

The following is an obvious corollary. 

Corollary 5.8. If $\preceq$ is a fault-tolerance partial order, then
$\forall m, n \in \mathbb{Z}^+$ and $\forall A \in U$, if $m < n$, we have:

(i) $mA \preceq nA$, and

(ii) $A^n \preceq A^m$.

Similarly, the following corollary generalizing Theorems 5.5 and 5.6
applies. The proof is omitted as obvious.

Corollary 5.9. The following hold for all $A, B \in U$ and all
$n \in \mathbb{Z}^+$:

(i) if $nA \preceq B$, then $A \preceq B$; and

(ii) if $A \preceq B^n$, then $A \preceq B$.

In a system semiring, the following result concerning preservation of
fault-tolerance behavior under composition also applies.

Theorem 5.10. If $\preceq$ is a partial order as described, and if
$A \preceq B$ and $C \preceq D$, then,

(i) $A + C \preceq B + D$, and

(ii) $A \times C \preceq B \times D$.

Proof. These results can be proven directly. Only (i) is proved, the proof of
(ii) being very similar.

We know the following:

$A \preceq B$ (1)

and:

$C \preceq D$ (2)

From (1) and the monotony law of addition (considering the direct sum of D
and both sides), we have:

$A + D \preceq B + D$. (3)

Similarly, from (2) and the monotony law (considering the direct sum of A
and both sides), we have:

$A + C \preceq A + D$. (4)

By considering transitivity in respect of (4) and (3), we get:

$A + C \preceq B + D$. □

The following corollary can be stated.

Corollary 5.11. If $\preceq$ is a fault-tolerance partial order and if
$A \preceq B$, then $\forall n \in \mathbb{Z}^+$,

(i) $nA \preceq nB$, and

(ii) $A^n \preceq B^n$.

Proof. These are straightforward iterated consequences of Theorem 5.10 and
the transitivity of $\preceq$. □

In consideration of the transitivity of the partial order $\preceq$, the
following generalization of Theorem 5.10 applies. The proof is immediate.

Theorem 5.12. If $A_i \preceq B_i$, with $0 \leq i \leq n - 1$ and
$A_i, B_i \in U$, then:

\[
\sum_{i=0}^{n-1} A_i \preceq \sum_{i=0}^{n-1} B_i
\]

and

\[
\prod_{i=0}^{n-1} A_i \preceq \prod_{i=0}^{n-1} B_i.
\]

The following is a result along the same lines as the previous two.

Theorem 5.13. If $A \preceq B$ and $m < n$, then:

(i) $mA \preceq nB$; and

(ii) $A^n \preceq B^m$.

Note that $A^m$ and $B^n$ cannot be compared just based on the data given.

Proof. For (i), we have that $mA \preceq mB$ by Corollary 5.11 (i), because
$A \preceq B$. We further have that $mB \preceq nB$ by Corollary 5.8 (i),
because $m < n$. By the transitivity of $\preceq$, we have the desired
result.

For (ii), we have that $A^n \preceq A^m$ by Corollary 5.8 (ii), because
$m < n$. We further have that $A^m \preceq B^m$ by Corollary 5.11 (ii),
because $A \preceq B$. As before, by the transitivity of $\preceq$, we have
the desired result. □

These results can be generalized in the obvious way, as follows. 

Theorem 5.14. If $A_i \preceq A_{i+1}$, $n_i < n_{i+1}$, and
$m_i > m_{i+1}$, for some range of values i, then we have:

\[
n_i A_i^{m_i} \preceq n_{i+1} A_{i+1}^{m_{i+1}}.
\]

Proof. It is clear that $A_i^{m_i} \preceq A_{i+1}^{m_{i+1}}$, noting that
$A_i \preceq A_{i+1}$ and $m_i > m_{i+1}$, and applying Theorem 5.13 (ii).
Now, given that $A_i^{m_i} \preceq A_{i+1}^{m_{i+1}}$, the fact that
$n_i < n_{i+1}$, in light of Theorem 5.13 (i), gives the result. □

This theorem leads to the following obvious corollary. 

Corollary 5.15. If $m < n$, then $mA^n \preceq nA^m$.
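
The ordering results of this section can be tried out on one concrete fault metric. The sketch below is only one possible instance, chosen for illustration and not taken from the paper: each system is summarized by its failure probability, disjoint parts are assumed to fail independently, a direct sum fails with probability $1 - (1 - a)(1 - b)$, a direct product fails with probability $ab$, and $A \preceq B$ is read as "A is no more likely to fail than B". Exact fractions are used to avoid rounding effects.

```python
from fractions import Fraction
from itertools import product

# Assumed fault metric (not from the paper): failure probability of a system
# built from disjoint, independently failing parts.
def plus(a, b):
    return 1 - (1 - a) * (1 - b)    # direct sum: down if either part is down

def times(a, b):
    return a * b                    # direct product: down only if both are

def n_sum(a, n):
    """Failure probability of nA (direct sum of n disjoint copies)."""
    result = Fraction(0)            # additive identity: never fails
    for _ in range(n):
        result = plus(result, a)
    return result

def power(a, n):
    """Failure probability of A^n (direct product of n disjoint copies)."""
    result = Fraction(1)            # multiplicative identity: always down
    for _ in range(n):
        result = times(result, a)
    return result

grid = [Fraction(k, 4) for k in range(5)]   # 0, 1/4, 1/2, 3/4, 1

# Lemma 5.2: A <= A + B and A x B <= A.
lemma_5_2 = all(a <= plus(a, b) and times(a, b) <= a
                for a, b in product(grid, repeat=2))

# Theorem 5.10: A <= B and C <= D imply A + C <= B + D and A x C <= B x D.
theorem_5_10 = all(plus(a, c) <= plus(b, d) and times(a, c) <= times(b, d)
                   for a, b, c, d in product(grid, repeat=4)
                   if a <= b and c <= d)

# Corollary 5.15: if m < n then mA^n <= nA^m.
corollary_5_15 = all(n_sum(power(a, n), m) <= n_sum(power(a, m), n)
                     for a in grid
                     for m in range(1, 4)
                     for n in range(m + 1, 5))
```

This metric induces a total preorder on systems rather than a genuine partial order, and it only models compositions of disjoint parts, so it illustrates the statements without exercising the full generality of Definition 5.1.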

6 Monoids of Fault-Tolerance Equivalence Classes

It has previously been noted in Section 4 that the set of all systems, along
with the direct sum (respectively: direct product) and the additive
(respectively: multiplicative) identity, defines a monoid. Another kind of
monoid may be defined on classes of systems, as follows.

We define an equivalence relation on fault tolerance, as follows. 

Definition 6.1. A fault-tolerance equivalence relation by worst-case fault
tolerance, $\mathcal{R}(\sim)$, is given by
$A \sim B \iff \Phi_{worst}(A) = \Phi_{worst}(B)$ for all $A, B \in U$.

Based on this, we note a simple theorem, given in [16].

Theorem 6.2. $A_1 \sim B_1$ and $A_2 \sim B_2$ together imply:

(i) $A_1 + A_2 \sim B_1 + B_2$; and

(ii) $A_1 \times A_2 \sim B_1 \times B_2$.

Proof. For part (i), we note from Theorem 3.5 (iv) that
$\Phi_{worst}(A_1 + A_2) = \min\{\Phi_{worst}(A_1), \Phi_{worst}(A_2)\}$. Let
this be $\Phi_{worst}(A_1)$. Similarly, then,
$\Phi_{worst}(B_1 + B_2) = \Phi_{worst}(B_1)$, because
$\Phi_{worst}(B_1) = \Phi_{worst}(A_1)$ and
$\Phi_{worst}(B_2) = \Phi_{worst}(A_2)$, and the result obtains.

For part (ii), we note from Theorem 3.5 (ii) that
$\Phi_{worst}(A_1 \times A_2) = \Phi_{worst}(A_1) + \Phi_{worst}(A_2) + 1$.
This is also equal to $\Phi_{worst}(B_1) + \Phi_{worst}(B_2) + 1$ by the
definition of $\mathcal{R}$, which in its turn is equal to
$\Phi_{worst}(B_1 \times B_2)$. □

Now we can state the major result (given for monoids by Hungerford [9]),
on constructing a monoid on the equivalence classes of systems. The proof
given here is as given by Hungerford.

Theorem 6.3. Let $\mathcal{R}(\sim)$ be defined as in Definition 6.1. Then
the set $U/\mathcal{R}$ of all equivalence classes of U under $\mathcal{R}$
is a monoid under the binary operation defined by
$\bar{A} \boxplus \bar{B} = \overline{A + B}$, where $\bar{A} \in 2^U$
denotes the equivalence class of A.

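Proof. If $\bar{A}_1 = \bar{A}_2$ and $\bar{B}_1 = \bar{B}_2$, where
$A_i, B_i \in U$, then $A_1 \sim A_2$ and $B_1 \sim B_2$. By Theorem 6.2
part (i), $A_1 + B_1 \sim A_2 + B_2$, so that
$\overline{A_1 + B_1} = \overline{A_2 + B_2}$. Therefore, the binary
operation $\boxplus$ in $U/\mathcal{R}$ is well-defined (i.e., it is
independent of the choice of equivalence-class representatives). It is
associative since
$\bar{A} \boxplus (\bar{B} \boxplus \bar{C}) = \bar{A} \boxplus \overline{B + C}
= \overline{A + (B + C)} = \overline{(A + B) + C}
= \overline{A + B} \boxplus \bar{C} = (\bar{A} \boxplus \bar{B}) \boxplus \bar{C}$.

The identity element is $\bar{0}$, the equivalence class of all systems that
are always up, since $\bar{A} \boxplus \bar{0} = \overline{A + 0} = \bar{A}$.
Therefore, $(U/\mathcal{R}, \boxplus)$ is a monoid. □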

An analogous theorem can also be stated in respect of the $\times$ operator,
with an identity element $\bar{1}$, the equivalence class of systems that
are always down. The proof is exactly similar and is thus omitted.

Theorem 6.4. Let U be the set of all systems, and the relation
$\mathcal{R}(\sim)$ be defined as in Definition 6.1. Then the set
$U/\mathcal{R}$ of all equivalence classes of U under $\mathcal{R}$ is a
monoid under the binary operation defined by
$\bar{A} \boxtimes \bar{B} = \overline{A \times B}$, where
$\bar{A} \subset U$ denotes the equivalence class of A.

Therefore, we have a second monoid, $(U/\mathcal{R}, \boxtimes)$.

Remark 6.5. Note that $U/\mathcal{R}$ is a set of classes of systems, rather
than of systems themselves. Therefore, $U/\mathcal{R} \subset 2^U$, and for
any $A \in U$ we have $\bar{A} \subset U$ (or: $\bar{A} \in 2^U$), where
$\bar{A}$ denotes the equivalence class of A by worst-case fault tolerance.

It is also possible to rework Theorem 6.3 considering the best-case fault
tolerance. To show how, we first state the analogue of Theorem 6.2.

Theorem 6.6. If we define an equivalence relation $\mathcal{R}(\sim)$ on
systems by best-case fault tolerance, such that
$A \sim B \iff \Phi_{best}(A) = \Phi_{best}(B)$, then $A_1 \sim B_1$ and
$A_2 \sim B_2$ imply $A_1 + A_2 \sim B_1 + B_2$.

Proof. We note from Theorem 3.5 (iii) that
$\Phi_{best}(A_1 + A_2) = \Phi_{best}(A_1) + \Phi_{best}(A_2)$. If
$A_1 \sim B_1$ and $A_2 \sim B_2$, then
$\Phi_{best}(A_1) + \Phi_{best}(A_2) = \Phi_{best}(B_1) + \Phi_{best}(B_2)$.
This latter expression is $\Phi_{best}(B_1 + B_2)$, giving us the result. □

Note that the analogue of Theorem 6.2 (ii) is not generally true: if
$\mathcal{R}(\sim)$ denotes an equivalence relation by best-case fault
tolerance, then $A_1 \sim B_1$ and $A_2 \sim B_2$ do not imply
$A_1 \times A_2 \sim B_1 \times B_2$.
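
A small counterexample makes the failure of the product case concrete. The sketch below uses the best-case recurrences of Theorem 3.5 (i) and (iii) on pairs (number of components, best-case tolerance); the helper names and the particular systems are assumptions chosen for illustration. It takes $A_1$ to be a single component and $B_1$ to be a direct sum of two components, both with best-case tolerance 0, and multiplies each by a single component.

```python
# Systems summarised as (size, best-case tolerance); hypothetical helpers.
def atom():
    return (1, 0)

def dsum(a, b):
    # Theorem 3.5 (iii): best-case tolerances add under direct sum.
    return (a[0] + b[0], a[1] + b[1])

def dprod(a, b):
    # Theorem 3.5 (i): one factor fails fully, the other stays at its limit.
    return (a[0] + b[0], max(a[1] + b[0], a[0] + b[1]))

A1 = atom()                  # single component p
B1 = dsum(atom(), atom())    # p + q
A2 = atom()                  # single component r
B2 = atom()                  # single component r

equivalent_inputs = (A1[1] == B1[1]) and (A2[1] == B2[1])

sum_A = dsum(A1, A2)
sum_B = dsum(B1, B2)
prod_A = dprod(A1, A2)       # p x r
prod_B = dprod(B1, B2)       # (p + q) x r
```

Here the two sums still have equal best-case tolerance (both 0), while the products do not: $p \times r$ tolerates one failure in the best case, whereas $(p + q) \times r$ tolerates two, since both p and q may fail while r stays up.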

Therefore, we can re-state Theorem 6.3 (but not Theorem 6.4) using the new
definition of $\mathcal{R}$. The statement and proof run just as previously,
however, so we do not belabor the point.

We may summarize the results of this section, however, to note that 
there are three types of equivalence-class monoids so obtained: 

• the monoid of worst-case equivalence classes under direct sums; 

• the monoid of worst-case equivalence classes under direct products; 
and 

• the monoid of best-case equivalence classes under direct sums. 

7 The Semiring of Fault-Tolerance Equivalence Classes 

It has been shown previously in Theorems 6.3 and 6.4 that there exist
monoids $(U/\mathcal{R}, \boxplus)$ and $(U/\mathcal{R}, \boxtimes)$ on the
equivalence classes of systems by worst-case fault tolerances, and that these
are commutative monoids.

It is easily seen that the other two conditions for a semiring [7] (see
Section 4) are also satisfied:
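
• $\boxtimes$ distributes over $\boxplus$: for any
$\bar{A}, \bar{B}, \bar{C} \in 2^U$, we have:

\[
\begin{aligned}
(\bar{A} \boxtimes \bar{B}) \boxplus (\bar{A} \boxtimes \bar{C})
  &= \overline{A \times B} \boxplus \overline{A \times C} \\
  &= \overline{(A \times B) + (A \times C)} \\
  &= \overline{A \times (B + C)} \\
  &= \bar{A} \boxtimes (\bar{B} \boxplus \bar{C})
\end{aligned}
\]

• $\bar{0} \boxtimes \bar{A} = \overline{0 \times A} = \bar{0} = \bar{A} \boxtimes \bar{0}$,
for all $\bar{A} \subset U$.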

Therefore, we can consider a different semiring,
$(2^U, \boxplus, \boxtimes)$, and it turns out that as with $(U, +, \times)$,
this too is zerosumfree, entire, simple, and commutative (compare with the
corresponding Remarks 4.2, 4.3, 4.4, and 4.6).
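
One way to get a feel for this semiring is to label each worst-case equivalence class by its common worst-case tolerance. The sketch below does this under two conventions that are assumptions of this illustration and go beyond the paper, where the tolerance functions map into the natural numbers: the class of the always-up system 0 is labeled with infinity (no set of failures brings it down), and the class of the always-down system 1 is labeled with -1 (the empty set of failures already brings it down). With these labels, Theorem 3.5 (iv) and (ii) suggest taking the minimum for the class sum and $a + b + 1$ for the class product.

```python
import math

# Assumed labels for the two identity classes (conventions of this sketch).
ZERO = math.inf    # class of the always-up system 0
ONE = -1           # class of the always-down system 1

def boxplus(a, b):
    """Class sum: worst-case tolerance of a direct sum, Theorem 3.5 (iv)."""
    return min(a, b)

def boxtimes(a, b):
    """Class product: worst-case tolerance of a direct product, (ii)."""
    return a + b + 1

samples = [ONE, 0, 1, 2, 5, ZERO]

identities = all(boxplus(a, ZERO) == a and boxtimes(a, ONE) == a
                 for a in samples)
distributive = all(
    boxtimes(a, boxplus(b, c)) == boxplus(boxtimes(a, b), boxtimes(a, c))
    for a in samples for b in samples for c in samples)
annihilates = all(boxtimes(ZERO, a) == ZERO for a in samples)
simple = all(boxplus(a, ONE) == ONE for a in samples)
```

Under these conventions the four checks all hold on the sample labels, which matches the semiring conditions and the "simple" property noted above; shifting every label up by one would turn the class product into ordinary addition, so the structure resembles a min-plus arrangement.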

It is further possible to define a partial ordering relation (denoted by the
symbol <, for example) comparing the fault tolerances of different classes
of systems. All of Section 5 can thus be repeated with $2^U$ in place of U,
$\bar{A}$ and such in place of A, and < in place of $\preceq$.